Skip to content

tests: remove flash validation error failure condition#83

Merged
kr-t merged 2 commits intoproject-ocre:mainfrom
PatrickRobbIOL:test-error-condition-removal
Oct 21, 2025
Merged

tests: remove flash validation error failure condition#83
kr-t merged 2 commits intoproject-ocre:mainfrom
PatrickRobbIOL:test-error-condition-removal

Conversation

@PatrickRobbIOL
Copy link
Collaborator

@PatrickRobbIOL PatrickRobbIOL commented Oct 8, 2025

Description

Previously, some tests within the flash validation testsuite were set to fail when the error prefix was detected. This is now removed per instruction from the Ocre community developers. Now, the flash validation testsuite includes only one testcase which just checks for the expected hello world string and fails if it is not detected.

Type of change

Please delete options that are not relevant.

  • CI system update
  • Test Coverage update

How Has This Been Tested?

This will be tested when the PR is opened, given that it is a change to how the GHA workflows are set up.

Checklist:

  • I have performed a self-review of my own code

@kr-t
Copy link
Collaborator

kr-t commented Oct 9, 2025

Hi @PatrickRobbIOL , shouldn't this now be green: https://github.com/project-ocre/ocre-runtime/actions/runs/18349248688
Can you check please, why it isn't successful?

@PatrickRobbIOL
Copy link
Collaborator Author

Hi @PatrickRobbIOL , shouldn't this now be green: https://github.com/project-ocre/ocre-runtime/actions/runs/18349248688 Can you check please, why it isn't successful?

Hi Krystian, yes tests.yml should be passing now. I see 2 issues:

  1. All 3 tests are running. This is not right, as 2 of the 3 tests have been removed from the flashValidation group, and also removed from the config.json for the flashValidation group. I am trying to figure out the reason. It may be that the checkout stage is checking out to the main branch instead of the feature branch which is the source for the PR. I will send a new commit to test this.

  2. Hello World is not in the "stdout" when a break is sent for the b_u5 board. I'm not sure why this is happening - when we first introduced the tests the hello world container was running on startup. This is what the logs for the tests.yml run show:


Beginning test group Flash Validation
Test the Ocre runtime after the flash is complete

Entering Setup...

Script: Flash Validation Setup

Setup is complete

Entering Testing...

Test Suite: Runtime Validation Tests

Test Case: Check Runtime Hello World

**** READING RESPONSE FROM BREAK ****

I: OSPI flash config is OPI / DTR
I: Read SFDP from octoFlash
I: Read SFDP from octoFlash
I: W5500 Initialized
I: chip id 0x6b
I: pressure: int on gpio@42021800.02
*** Booting Zephyr OS build 8d0d392f8cc7 ***
Waiting for network to be ready...

ocre:~$
**** CLOSING CONNECTION ****

TEST FAILED


I wonder, could this be the result of networking being introduced? After a certain timeout, will the "waiting for network to be ready..." timeout and the Hello World container will be run? Or perhaps the build procedure in build.yml was modified in the past months to no longer include the Hello World container by default. Anyhow, I will try to figure out why the Hello World container is not running anymore.

@kr-t
Copy link
Collaborator

kr-t commented Oct 13, 2025

Hi @PatrickRobbIOL , the networking is implemented in a way that it shall not be a breaking change. If we don't have internet, we just print it and go on with the flow. Check this log:
image
This is an U585 without the internet connection - as you can see, after being Unable to connect to network, we still run the container.

@PatrickRobbIOL PatrickRobbIOL force-pushed the test-error-condition-removal branch 3 times, most recently from 903d6cb to 5cd30f5 Compare October 13, 2025 17:53
@PatrickRobbIOL
Copy link
Collaborator Author

Hi @PatrickRobbIOL , the networking is implemented in a way that it shall not be a breaking change. If we don't have internet, we just print it and go on with the flow. Check this log:

Thank you. My PR is now passing the hello world test. The reason it was not working is because it was reading the stdout when the board was halfway through initializing. So, it was capturing something from the first half (the networking setup) but it was too early to capture the "Hello World from Ocre." I think the reason why this used to work with the old timeout is that less was happening during the ocre init, and now that more is happening, a longer sleep is needed.

I also had to move the tests from tests.yml to build.yml because of restrictions related to workflow_run.

@kr-t
Copy link
Collaborator

kr-t commented Oct 14, 2025

I can see that hello world passes. Shall we remove the test.yml then? Because as of now we still have them failing, which we probably don't want as our default state. Or would you suggest changing something else? Thanks!

@PatrickRobbIOL
Copy link
Collaborator Author

I can see that hello world passes. Shall we remove the test.yml then? Because as of now we still have them failing, which we probably don't want as our default state. Or would you suggest changing something else? Thanks!

Hi. Right, tests.yml is being removed in this PR. This is desired from my perspective as I want to move the invocation of the tests to build.yml.

The tests.yml test is still running (until this PR is merged) because it runs based on the main branch, not the PR source branch.

kr-t
kr-t previously approved these changes Oct 15, 2025
@kr-t
Copy link
Collaborator

kr-t commented Oct 15, 2025

Hi @PatrickRobbIOL I'd merge it, but you have to have verified signature on github, otherwise it isn't letting me.
https://docs.github.com/en/authentication/managing-commit-signature-verification/displaying-verification-statuses-for-all-of-your-commits

@PatrickRobbIOL PatrickRobbIOL force-pushed the test-error-condition-removal branch 3 times, most recently from 20dbeca to adc41ad Compare October 17, 2025 20:59
Previously, some tests within the flash validation testsuite
were set to fail when the error prefix was detected. This is
now removed per instruction from the Ocre community developers.
Now, the flash validation testsuite includes only one testcase
which just checks for the expected hello world string and fails
if it is not detected.

Signed-off-by: Patrick Robb <probb@iol.unh.edu>
@PatrickRobbIOL PatrickRobbIOL force-pushed the test-error-condition-removal branch from adc41ad to 84e081c Compare October 17, 2025 21:07
@PatrickRobbIOL
Copy link
Collaborator Author

Thanks. I actually wasn't aware of key signing git commits. Cool!

So, this PR should be unblocked now. If we can merge this, early next week Matt and I can send in some new testcases. He showed me his testcase for the modbus server today and it looks good.

@kr-t
Copy link
Collaborator

kr-t commented Oct 20, 2025

Hi @PatrickRobbIOL , the flashing test isn't working anymore. I think there is nothing wrong with the test itself, but the way we flash the board. If you check the earlier test, it was fine (on right) - but now somehow it isn't loading the correct image (left). Check the sizes of the containers - on right it's the correct 3850, on left some giant size of something else. I see two possible problems: either you tested something else, and the debugger aren't clearing correctly the flash - or you have some other zephyr.bin which it flashes instead if the image it should.
image

P.S. I have downloaded and tested the artifact from this PR - it works fine on my board. So something is messed up either during the automatic download/storage or the flash itself.
I'd suggest rewriting the flash command to:
STM32_Programmer_CLI -c port=swd -e all -w zephyr.bin 0x08000000 -v -rst
-e all would erase the flash before writing it, this might solve the issue if there is some garbage left.

We need to rely on this test and on the board, otherwise no pull request will be able to pass it if it won't work deterministically.

@PatrickRobbIOL
Copy link
Collaborator Author

Hi @PatrickRobbIOL , the flashing test isn't working anymore. I think there is nothing wrong with the test itself, but the way we flash the board. If you check the earlier test, it was fine (on right) - but now somehow it isn't loading the correct image (left). Check the sizes of the containers - on right it's the correct 3850, on left some giant size of something else. I see two possible problems: either you tested something else, and the debugger aren't clearing correctly the flash - or you have some other zephyr.bin which it flashes instead if the image it should. image

P.S. I have downloaded and tested the artifact from this PR - it works fine on my board. So something is messed up either during the automatic download/storage or the flash itself. I'd suggest rewriting the flash command to: STM32_Programmer_CLI -c port=swd -e all -w zephyr.bin 0x08000000 -v -rst -e all would erase the flash before writing it, this might solve the issue if there is some garbage left.

We need to rely on this test and on the board, otherwise no pull request will be able to pass it if it won't work deterministically.

Thanks, yes I looked this morning and I saw the new logs, and that instead of starting the Hello World container, it says the container fails to start. Matt will look at it today.

Matt, first things first let me update the PR so that it uses Krystian's recommended command.

Previously, the test groups were invoked within a dedicated tests
workflow, triggered by a workflow_run event. This did not work well
because workflows triggered by workflow_run are forced to run from the
main branch as a security measure. So, changes to the test workflow
coming in from a pull request were not used in the CI testing.

This commit also includes an increased sleep on the flash validation
test, allowing it to properly initialize the hello world container and
fill the output buffer with the expected string.

Signed-off-by: Patrick Robb <probb@iol.unh.edu>
@kr-t
Copy link
Collaborator

kr-t commented Oct 21, 2025

Hi Patrick, it seems that command might have worked - new commit passes with no issues.
Let me retrigger the actions to check if they would pass once more - if yes, we can merge this and focus on new stuff.
image

@kr-t
Copy link
Collaborator

kr-t commented Oct 21, 2025

It seems that it is working, even though I had to retrigger it again. Let's deliver it, but I propose that in the future we review our testing and separate them into multiple files/jobs.

@kr-t kr-t merged commit 90be9a4 into project-ocre:main Oct 21, 2025
33 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants